Abstract

The main objective of this project is to develop a multiple-human object tracking approach based
on motion estimation and detection, background subtraction, shadow removal, and occlusion
detection. A reference frame is initially taken as the background information. When a new object
enters the frame, the foreground and background information are identified using the reference
frame as the background model. In most cases, the shadow is merged with the foreground object,
which makes the tracking process more complex. The algorithm models the desired background as a
reference model that is later used in background subtraction to produce the foreground pixels,
that is, the deviation of the current frame from the reference frame. In this approach,
morphological operations are used to identify and remove the shadow. Occlusion is one of the
most common events in object tracking; the centroid of each object is used to detect occlusion
and to identify each object separately. Video sequences will be captured and processed with the
proposed algorithm.

Keywords: Background modeling and subtraction, human motion detection, object tracking, 
shadow removal, occlusion.

1. Introduction

Object tracking can be defined as the process of segmenting an object of interest from a video
scene and keeping track of its motion, orientation, occlusion, etc. in order to extract useful
information. Object tracking in video processing follows the segmentation step and is more or
less equivalent to the 'recognition' step in image processing. Detection of moving objects in
video streams is the first relevant step of information extraction in many computer vision
applications, including traffic monitoring, automated remote video surveillance, and people
tracking.

Extracting moving objects from a video sequence is a fundamental and crucial problem in many
vision systems, including video surveillance [1, 2], traffic monitoring [3], human detection and
tracking for video teleconferencing or human-machine interfaces [4, 5, 6], and video editing,
among other applications.

In applications that use cameras fixed with respect to a static background (e.g. stationary
surveillance cameras), a very common approach is to use background subtraction to obtain an
initial estimate of moving objects. Basically, background subtraction consists of comparing each
new frame with a representation of the scene background: significant differences usually
correspond to foreground objects. Ideally, background subtraction should detect real moving
objects with high accuracy, limiting false negatives (object pixels that are not detected) as
much as possible; at the same time, it should extract pixels of moving objects with the maximum
possible responsiveness, avoiding detection of transient spurious objects such as cast shadows,
static objects, or noise.

In this paper, we present a shadow removal technique which effectively eliminates a human
shadow cast by a light source of unknown direction. A multi-cue shadow descriptor is
proposed to characterize the distinctive properties of shadows. We employ a 3-stage process to
detect and then remove shadows. Our algorithm improves the shadow detection accuracy by
imposing a spatial constraint between the foreground subregions of the human and the shadow.

The existence of human shadows is a general problem in tracking and recognizing human
activities. Shadows not only distort the color properties of the shaded area but also
complicate the edge structure of the figure as a whole. Several factors together determine the
appearance of a shadow, for example, the viewpoint of the camera, the angle of incidence, the
light intensity, and the number of light sources. Further, under the sun, the dominant
orientation of a human shadow changes as a function of time. Therefore, a human tracker becomes
more prone to miss the target, and the motion pattern of a single action varies considerably.
For simplicity, by human shadow we mean a human cast shadow, in contrast with a human self
shadow.



2. Desired implication 



2.1 Object Tracking 

The basic steps in object tracking can be listed as:
1. Segmentation
2. Foreground / background extraction
3. Camera modeling
4. Feature extraction and tracking



2.1.1. Segmentation 

Segmentation is the process of identifying components of the image. Segmentation involves
operations such as boundary detection, connected component labeling, and thresholding.
Boundary detection finds the edges in the image; any differential operator can be used for
boundary detection [7, 8]. Thresholding is the process of reducing the number of grey levels in
the image, and many algorithms exist for it [7, 8]. See [8] for connected component labeling
algorithms.
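
As a rough illustration of two of the operations named above, the following Python sketch
thresholds a small grey-level image to two levels and then labels its connected components. The
function names, the threshold value of 128, the sample image, and the choice of 4-connectivity
are all assumptions made for this example; they are not taken from [7, 8].

```python
from collections import deque

def threshold(image, t):
    """Reduce a grey-level image to two levels: 1 where pixel > t, else 0."""
    return [[1 if p > t else 0 for p in row] for row in image]

def label_components(binary):
    """Label 4-connected foreground regions; returns (label image, count)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 1 and labels[r][c] == 0:
                # Unlabeled foreground pixel: start a new component here.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

# Hypothetical 3x4 grey-level image with two bright regions.
image = [
    [200, 210, 10, 10],
    [190, 20, 10, 180],
    [10, 10, 10, 170],
]
binary = threshold(image, 128)
labels, count = label_components(binary)
```

Here the two bright regions end up with different labels, which is the property later stages
rely on when they pick out individual blobs.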



2.1.2 Foreground extraction 

As the name suggests, this is the process of separating the foreground and background of the
image. Here it is assumed that the foreground contains the objects of interest.



2.1.3. Background extraction 

Once the foreground is extracted, a simple subtraction operation can be used to extract the
background [1]. The following figure illustrates this operation:

Another method that can be used in object tracking is background learning. This approach can
be used when fixed cameras are used for video capture. In this method, an initial training step
is carried out before deploying the system. In the training step, the system continuously
records the background in order to 'learn' it. Once the training is complete, the system has
complete (or almost complete) information about the background. Though this step is somewhat
lengthy, it has a very important advantage: once the background is known, extracting the
foreground is a matter of simple image subtraction.
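
The background learning idea can be sketched in a few lines of Python. This is only a minimal
illustration: the per-pixel median as the learned background, the deviation threshold of 30, the
function names, and the sample frames are assumptions of this example, since the text does not
specify how the background is learned.

```python
from statistics import median

def learn_background(training_frames):
    """Per-pixel median over the training frames (an assumed background model)."""
    rows, cols = len(training_frames[0]), len(training_frames[0][0])
    return [[median(f[r][c] for f in training_frames) for c in range(cols)]
            for r in range(rows)]

def extract_foreground(frame, background, threshold=30):
    """Mark pixels whose absolute deviation from the background exceeds threshold."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# Hypothetical training frames of an empty scene with slight sensor noise.
training = [
    [[100, 102, 98], [101, 99, 100]],
    [[101, 100, 99], [100, 100, 102]],
    [[99, 101, 100], [102, 101, 101]],
]
background = learn_background(training)

# A new frame in which an object changes the middle column.
frame = [[100, 180, 99], [103, 20, 101]]
mask = extract_foreground(frame, background)
```

Small deviations caused by noise stay below the threshold, so only the pixels changed by the
object are marked as foreground.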

2.1.4. Camera modeling

The camera model is an important aspect of any object-tracking algorithm. All existing
object-tracking systems use a preset camera model. In other words, the camera model is derived
directly from domain knowledge and is used to adjust all the inputs. This is what is done in
[10]. For a moving camera, some heuristic about the camera motion is needed. If exact
information about the camera movement is available, it can be included in the form of
transformations. Having multiple moving cameras is a very complicated situation (but one that
arises in many real-world applications). It requires the algorithm to model the motion of all
the cameras as well as to integrate the results from all of them.

2.1.5. Feature extraction

This is an area of image processing that uses algorithms to detect and isolate various desired
portions of a digitized image. A feature is a significant piece of information extracted from an
image which provides a more detailed understanding of the image. Feature extraction involves
reducing the amount of resources required to describe a large set of data accurately. When
performing analysis of complex data, one of the major problems stems from the number of
variables involved. Feature extraction is a general term for methods of constructing
combinations of the variables to get around these problems while still describing the data with
sufficient accuracy.

3. Shadow removal technique 



3.1. Algorithm 

The flowchart in Figure 1 shows the main algorithm proposed in this project, and the resultant
image for every process is shown with an example in Figure 4. It is assumed that the input (the
object's blob and the background's blob) is obtained from some background subtraction method.
Each process in Figure 1 is explained in the following sections.

Fig 1. Overall algorithm of proposed shadow removal technique



3.1.1. Image Division 

In this process, the object's blob, ob(x, y), $\{x, y \in Z^2\}$, is divided by the background's
blob, bk(x, y), $\{x, y \in Z^2\}$. The purpose of image division is to highlight the
homogeneity property of shadows. The resultant image after the division process is multiplied by
a constant in order to amplify the signal of the resultant image. In this case, the constant
value is 100 (Eq. 1). The result of this process is defined as Img_Div(x, y).

\[ Img\_Div(x, y) = \frac{ob(x, y)}{bk(x, y)} \times 100, \quad \forall x \in X, \ \forall y \in Y \qquad (1) \]
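
A minimal Python sketch of the image division step in Eq. 1 is given below. The function name,
the sample pixel values, and the small `eps` guard against division by zero are assumptions of
this example and are not part of the described technique.

```python
def image_division(ob, bk, scale=100, eps=1e-6):
    """Img_Div(x, y) = ob(x, y) / bk(x, y) * scale, following Eq. 1.

    eps avoids division by zero where the background pixel is 0;
    this guard is an assumption of the sketch.
    """
    return [[o / max(b, eps) * scale for o, b in zip(orow, brow)]
            for orow, brow in zip(ob, bk)]

# Hypothetical 2x2 grey-level blobs: the left column is darkened by a shadow,
# the right column is unchanged (top) or covered by a bright object (bottom).
ob = [[50, 120], [60, 200]]
bk = [[100, 120], [120, 100]]
img_div = image_division(ob, bk)
```

In this toy input, the two shaded pixels have different absolute intensities but the same ratio
to the background, which is the kind of homogeneity the division is meant to expose.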



3.1.2. Thresholding 

The purpose of thresholding is to decide the shadow's blob in the resultant image after the
image division process (Img_Div). In this proposed technique, the range has been set according
to the scene (Eq. 2), and this is done by studying the histogram of the division image over a
few samples.

\[ Img\_Th(x, y) = \begin{cases} 1, & t_{min} < Img\_Div(x, y) < t_{max} \\ 0, & \text{otherwise} \end{cases} \qquad (2) \]
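
The range test of the thresholding step can be sketched as follows. The binary 1/0 output, the
function name, and the values chosen for t_min and t_max are assumptions of this example; in the
described technique the range is chosen per scene from histograms of the division image.

```python
def threshold_division(img_div, t_min, t_max):
    """Return 1 where t_min < Img_Div(x, y) < t_max, else 0."""
    return [[1 if t_min < v < t_max else 0 for v in row] for row in img_div]

# Hypothetical division image and range.
img_div = [[50.0, 100.0], [50.0, 200.0]]
img_th = threshold_division(img_div, t_min=40, t_max=70)
```

Pixels whose ratio to the background falls inside the range are kept as shadow candidates; all
other pixels are discarded.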

3.1.3 Filtering 

The purpose of filtering is to enhance the resultant image after the thresholding process
(Img_Th) and to find the biggest blob, which is predicted to be the shadow's blob or shadow
region. The filtering process includes filling, erosion, and dilation to enhance the image, and
labeling to predict the shadow. It is assumed that the biggest blob after the labeling process
(connected component process) is the shadow region.
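
The filtering chain can be sketched in plain Python as shown below. This is an illustrative
sketch only: the order of the operations, the 3x3 square structuring element, 4-connectivity,
and all function names are assumptions, since the text does not fix these details.

```python
from collections import deque

NEIGHBOURS_4 = ((1, 0), (-1, 0), (0, 1), (0, -1))

def _flood(mask, seeds, value):
    """Set of pixels equal to `value` that are 4-connected to any seed."""
    rows, cols = len(mask), len(mask[0])
    seen = set(s for s in seeds if mask[s[0]][s[1]] == value)
    queue = deque(seen)
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS_4:
            ny, nx = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx < cols
                    and mask[ny][nx] == value and (ny, nx) not in seen):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return seen

def fill_holes(mask):
    """Set to 1 every background pixel that is not connected to the border."""
    rows, cols = len(mask), len(mask[0])
    border = [(r, c) for r in range(rows) for c in range(cols)
              if r in (0, rows - 1) or c in (0, cols - 1)]
    outside = _flood(mask, border, 0)
    return [[0 if (r, c) in outside else 1 for c in range(cols)]
            for r in range(rows)]

def _morph(mask, combine):
    """Apply a 3x3 square structuring element; out-of-image pixels count as 0."""
    rows, cols = len(mask), len(mask[0])
    def at(y, x):
        return mask[y][x] if 0 <= y < rows and 0 <= x < cols else 0
    return [[int(combine(at(r + dy, c + dx)
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for c in range(cols)] for r in range(rows)]

def erode(mask):
    return _morph(mask, all)

def dilate(mask):
    return _morph(mask, any)

def biggest_blob(mask):
    """Label 4-connected blobs and keep only the one with the most pixels."""
    rows, cols = len(mask), len(mask[0])
    remaining = {(r, c) for r in range(rows) for c in range(cols) if mask[r][c]}
    best = set()
    while remaining:
        blob = _flood(mask, [next(iter(remaining))], 1)
        remaining -= blob
        if len(blob) > len(best):
            best = blob
    return [[1 if (r, c) in best else 0 for c in range(cols)]
            for r in range(rows)]

def filter_shadow(img_th):
    """Filling, erosion, dilation, then keep the biggest labeled blob."""
    return biggest_blob(dilate(erode(fill_holes(img_th))))

# Hypothetical thresholded image: a 4x4 blob with a one-pixel hole,
# plus an isolated noise pixel at row 2, column 6.
img_th = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
shadow = filter_shadow(img_th)
```

In this example the hole is filled, the noise pixel is removed by the erosion, and the dilation
restores the blob to its original extent.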



3.1.4 Boundary Removal 

The purpose of this process is to remove the penumbra region of the shadow, or in other words,
to remove the shadow's boundary. The first step is to get the coordinates of the boundary of the
object's blob; this is also called the boundary tracing process. After that, each boundary pixel
and its neighbors are checked to determine whether they are shadow pixels. Only neighbor pixels
located in the horizontal, vertical, and diagonal (45 and -45 degree) directions from the
boundary pixel, at a certain offset (the distance between the neighbor pixel and the boundary
pixel), are checked.
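
The neighbor check can be illustrated with the Python sketch below. Several points are
assumptions of this example rather than details given in the text: the boundary is collected by
a simple scan instead of an ordered contour trace, neighbors are tested at exactly the given
offset, and a boundary pixel is flagged as penumbra when it or any checked neighbor is a shadow
pixel. All names are hypothetical.

```python
# Horizontal, vertical, and the two diagonal (45 and -45 degree) directions.
DIRECTIONS = ((0, 1), (0, -1), (1, 0), (-1, 0),
              (1, 1), (-1, -1), (1, -1), (-1, 1))

def boundary_pixels(blob):
    """Blob pixels with at least one 4-neighbour outside the blob (row-major order)."""
    rows, cols = len(blob), len(blob[0])
    def inside(y, x):
        return 0 <= y < rows and 0 <= x < cols and blob[y][x] == 1
    return [(r, c) for r in range(rows) for c in range(cols)
            if blob[r][c] == 1
            and not all(inside(r + dy, c + dx) for dy, dx in DIRECTIONS[:4])]

def penumbra_pixels(blob, shadow, offset=1):
    """Boundary pixels that are shadow, or have a shadow pixel at `offset`
    along one of the eight checked directions (assumed decision rule)."""
    rows, cols = len(blob), len(blob[0])
    def is_shadow(y, x):
        return 0 <= y < rows and 0 <= x < cols and shadow[y][x] == 1
    found = []
    for r, c in boundary_pixels(blob):
        candidates = [(r, c)] + [(r + dy * offset, c + dx * offset)
                                 for dy, dx in DIRECTIONS]
        if any(is_shadow(y, x) for y, x in candidates):
            found.append((r, c))
    return found

# Hypothetical 3x5 object's blob and a detected shadow core at row 1, column 3.
blob = [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]]
shadow = [[0, 0, 0, 0, 0],
          [0, 0, 0, 1, 0],
          [0, 0, 0, 0, 0]]
penumbra = penumbra_pixels(blob, shadow, offset=1)
```

The flagged pixels would then be removed from the object's blob together with the shadow region
itself.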



3.1.5 Removal Validation 

The removal validation process consists of two sub-processes: the Percentage Checking process
and the Vertical Scan process. The purpose of the Percentage Checking process is to check
whether the removal process was correct. This is done by checking the percentage of the area
that has been removed in the removal process (Eq. 3), where BR represents the percentage of area
that has been removed over the area of the whole object's blob. Based on the study and analysis
of sample images, the shadow removal is correct if the percentage value is within a range that
depends on the scene (Eq. 4), where RV, percent_min and percent_max represent the removal
validation result, the minimum percentage, and the maximum percentage (the range). This
percentage range will be explained later in section IV. If the percentage value does not fall in
that range, it is assumed that the removal did not work correctly.



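\[ BR = \frac{\text{area that has been removed}}{\text{object's blob area}} \times 100 \qquad (3) \]

\[ RV = \begin{cases} \text{true}, & percent_{min} < BR < percent_{max} \\ \text{false}, & \text{otherwise} \end{cases} \qquad (4) \]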
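
Eq. 3 and Eq. 4 translate directly into code. In the sketch below, the function names, the pixel
counts, and the range of 10 to 60 percent are hypothetical values chosen only to show the
calculation; the actual range depends on the scene.

```python
def removal_percentage(removed_area, blob_area):
    """BR: removed area as a percentage of the object's blob area (Eq. 3)."""
    return removed_area / blob_area * 100

def removal_is_valid(br, percent_min, percent_max):
    """RV: true only if percent_min < BR < percent_max (Eq. 4)."""
    return percent_min < br < percent_max

# Hypothetical example: 300 of 1200 blob pixels were removed as shadow.
br = removal_percentage(300, 1200)
rv = removal_is_valid(br, percent_min=10, percent_max=60)
```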



The second sub-process is the Vertical Scan process, which checks which part of the object's
blob is predicted as a shadow region. In the Filtering process, it is assumed that the biggest
blob after the labeling process is the shadow region. However, based on the study of input
samples, sometimes the second biggest blob is the correct shadow region and the biggest blob is
not a shadow. Figure 3 shows an example where the Filtering process has made a wrong prediction;
based on the analysis, this biggest blob (the wrongly predicted shadow region) is always located
at the center of the object's blob.

Figure 3, Vertical scan to determine correct removal (panel label: Correct Output)

So, in the Vertical Scan process, a vertical scan is performed through the centroid of the
object's blob to make sure that the predicted shadow region is not located at the center of the
object's blob. However, the Vertical Scan process can only be applied to certain scenes. Some
scenes are not suitable because applying it would only produce a poorer result.
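
One possible reading of the vertical scan is sketched below in Python. It assumes that the scan
follows the image column passing through the centroid of the object's blob and that the
prediction is rejected if any pixel of the predicted shadow region lies on that column. The
function names and the sample masks are hypothetical.

```python
def centroid_column(blob):
    """Column index of the blob's centroid (mean column of blob pixels, rounded)."""
    xs = [c for row in blob for c, v in enumerate(row) if v == 1]
    return round(sum(xs) / len(xs))

def shadow_at_centre(blob, shadow):
    """True if the predicted shadow region crosses the centroid column."""
    col = centroid_column(blob)
    return any(row[col] == 1 for row in shadow)

# Hypothetical object's blob: an upright body (columns 1-3) with a shadow
# extending to the right along the bottom two rows (columns 4-6).
blob = [
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1],
]
# Prediction that marks the true shadow on the right.
good_shadow = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
]
# Prediction that wrongly marks part of the body as shadow.
bad_shadow = [
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
good_rejected = shadow_at_centre(blob, good_shadow)
bad_rejected = shadow_at_centre(blob, bad_shadow)
```

When a prediction is rejected in this way, the next biggest blob could be tried instead, in line
with the observation that the second biggest blob is sometimes the correct shadow region.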
Fig 4. Shadow removal process

In each scene, the vehicle is monitored and analyzed for a period of time and the overall
success rate is calculated. First, the percentage of correct results is computed for every video
sample, PV_i {i = 1..N}, as the number of frames that have the correct result over the number of
frames in the sample. Then the average percentage of correct results over the video samples in
the same scene (PSS) is computed. Eq. 5 and Eq. 6 are the formulas applied in this analysis,
where PV_i, N and PSS represent the percentage of correct results from a video sample, the
number of video samples in a scene, and the percentage of correct removal in a scene,
respectively.

\[ PV_i = \frac{\text{no. of frames with correct result}}{\text{no. of frames in video sample}} \qquad (5) \]

\[ PSS = \frac{\sum_{i=1}^{N} PV_i}{N} \qquad (6) \]
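
The two success-rate measures can be computed as in the short Python sketch below. The frame
counts are made-up numbers used only to show the arithmetic, and the function names are
hypothetical. The values are returned as fractions; multiplying by 100 expresses them as
percentages.

```python
def pv(correct_frames, total_frames):
    """Fraction of frames with a correct result in one video sample (Eq. 5)."""
    return correct_frames / total_frames

def pss(pv_values):
    """Average of the per-sample PV values for one scene (Eq. 6)."""
    return sum(pv_values) / len(pv_values)

# Hypothetical scene with N = 3 video samples: (correct frames, total frames).
samples = [(45, 50), (30, 40), (18, 20)]
pv_values = [pv(correct, total) for correct, total in samples]
scene_rate = pss(pv_values)
```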



4. Proposed work and objectives 

The main objective of this project is to develop an algorithm that can detect human motion at
a certain distance for object tracking applications. The tasks carried out include motion
detection, background modeling and subtraction, foreground detection, shadow detection and
removal, morphological operations, and occlusion identification.



5. Conclusion 

In this paper, an approach capable of detecting motion and extracting object information,
with humans as the objects, is described. The algorithm models the desired background as a
reference model that is later used in background subtraction to produce the foreground pixels,
that is, the deviation of the current frame from the reference frame. The deviation, which
represents the moving object within the analyzed frame, is further processed to localize the
object and extract its information. We present an effective technique to remove human shadows.
Our method has led to accurate recognition of activities.